Two dual questions in quantum information theory are to determine the communication cost of 
simulating a bipartite unitary gate, and to determine its communication capacity. We present 
a bipartite unitary gate with two surprising properties: 1) simulating it with the assistance of 
unlimited EPR pairs requires far more communication than with a better choice of entangled state, 
and 2) its communication capacity is far lower than its capacity to create entanglement. This 
suggests that 1) unlimited EPR pairs are not the most general model of entanglement assistance 
for two-party communication tasks, and 2) the entangling and communicating abilities of a unitary 
interaction can vary nearly independently. The technical contribution behind these results is a 
communication-efficient protocol for measuring whether an unknown shared state lies in a specified 
rank-one subspace or its orthogonal complement. 



Introduction. Many basic questions in quantum infor- 
mation theory can be phrased as determining the rates 
at which standard communication resources (EPR pairs, 
noiseless qubit channels, etc.) can be converted to and 
from more specialized resources (such as an available 
noisy channel, or computation of functions of interest 
with distributed inputs). Typically local operations are 
allowed for free; sometimes entanglement is as well. For 
example, channel capacities are the maximum rates at 
which noisy channels can be turned into noiseless ones, 
while the quantum communication complexity of a function 
$f$ is related to the minimum rate at which noiseless 
quantum communication is turned into evaluations of $f$. 

In quantum mechanics, the most general interaction 
between two systems, given sufficient isolation from the 
environment, is a bipartite unitary quantum gate U. We 
will think of the systems (A and B) as each comprising n 
qubits, and as being held by two parties, Alice and Bob. 

A fundamental goal of quantum information process- 
ing is to simulate interactions (i.e. unitaries) using as few 
resources as possible. This Letter investigates these sim- 
ulation costs when different types of entanglement are 
given for free. We will define $C^{\rm any}_{{\rm sim},\epsilon}(U)$ to be the number 
of bits of classical communication necessary to simulate 
U up to error $\epsilon$ if Alice and Bob are allowed to start with 
an entangled state of their choice. (Given free entanglement, 
the quantum and classical communication costs 
differ by a factor of exactly 2, due to teleportation [1] and 
super-dense coding.) The canonical form of entanglement 
is the EPR pair, since it can be converted to many 
copies of any other state using an asymptotically vanishing 
amount of communication per copy 0]. Accordingly, 
we also let $C^{\rm EPR}_{{\rm sim},\epsilon}(U)$ denote the classical communication 
cost of simulating U up to error $\epsilon$ given unlimited EPR 
pairs. 



Also of interest is the effectiveness of unitaries at send- 
ing classical messages or generating entanglement. The 
ultimate limit to which this can be done is given by the 
rate achievable with an asymptotically large number of 
uses and vanishing error (previously defined in Q). Note 
that these unitaries can communicate in either direction, 
or both simultaneously. We are primarily interested in 
the combined rate in both directions (as with simulation 
costs). Let $C^{\rm any}_{{\rm cap},\epsilon}(U)$ and $C^{\rm EPR}_{{\rm cap},\epsilon}(U)$ denote the largest 
number of bits that U can transmit in a single use up 
to error $\epsilon$, when allowed arbitrary entanglement or free 
EPR pairs, respectively. The corresponding asymptotic 
capacities are denoted $C^{\rm any}_{\rm cap}(U)$ and $C^{\rm EPR}_{\rm cap}(U)$. (Previous 
works 0, [j| used the notation $C^E_+(U)$ for the latter 
scenario.) Likewise, let $E_{\rm cap}(U)$ denote the asymptotic 
entanglement capacity. Naturally, simulation costs are 
upper bounds to communication capacities. 

We might reasonably expect that these capacities re- 
flect the interaction strength of the unitaries, and thus 
if one capacity is large, the others should be as well. 
For example, a gate that communicates well in the for- 
ward direction ought to also do so in the backward di- 
rection, and a highly entangling gate should also disen- 
tangle or communicate a lot. This is indeed the case for 
some well-studied unitaries (e.g., CNOT, swap, and unitaries 
close to the identity). Additionally, it has been 
proven that if one of these capacities is positive, the others 
are as well, and that communication capacities 
are generally lower bounds of the entanglement capacity 
($C^{\rm any}_{\rm cap}(U) = C^{\rm EPR}_{\rm cap}(U) \le E_{\rm cap}(U) + E_{\rm cap}(U^\dagger)$) 0,1. 

However, beyond the above proven bounds, little support 
was found for the intuition. More recently, Ref. 0] finds 
gates exhibiting arbitrarily large differences between entanglement 
and disentanglement capacities (see also Q), 
and between forward and backward communication capacities. 
In this paper, we demonstrate the remaining 
separation: an arbitrarily large difference between entan- 
glement capacity and communication capacity. Together 
with the results of [f|, this indicates that most unitary 
gate capacities of interest can vary nearly independently. 

The gate U. For our gate U, A and B each have $d+1$ 
dimensions (or equivalently, $n = \log(d+1)$ qubits) and a 
basis given by $\{|0\rangle, \ldots, |d\rangle\}$. Let 
$|\Phi\rangle = \frac{1}{\sqrt{d}}(|11\rangle + \cdots + |dd\rangle)$ 
and $P = |00\rangle\langle 00| + |\Phi\rangle\langle\Phi|$. Define 

$U = |00\rangle\langle\Phi| + |\Phi\rangle\langle 00| + I - P.$ 

In other words, U swaps $|00\rangle$ with $|\Phi\rangle$ and leaves the rest 
of the space (i.e. the support of $I - P$) unchanged. Note 
that $U = U^\dagger$. 
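
As a sanity check on this definition, the following minimal sketch builds U as an explicit real matrix for a small d and confirms that it exchanges $|00\rangle$ and $|\Phi\rangle$. The helper names (`build_gate`, `apply`, `close`) and the basis ordering (joint index a(d+1)+b for $|ab\rangle$) are illustrative assumptions, not part of the construction itself.

```python
import math

def build_gate(d):
    """Return (U, e00, phi) with U = |00><Phi| + |Phi><00| + I - P."""
    n = d + 1                       # local dimension of A and of B
    dim = n * n                     # dimension of the joint system AB
    e00 = [0.0] * dim
    e00[0] = 1.0                    # |00> sits at joint index 0*n + 0
    phi = [0.0] * dim
    for k in range(1, d + 1):       # |Phi> = (|11> + ... + |dd>) / sqrt(d)
        phi[k * n + k] = 1.0 / math.sqrt(d)
    U = [[0.0] * dim for _ in range(dim)]
    for r in range(dim):
        for c in range(dim):
            identity = 1.0 if r == c else 0.0
            proj = e00[r] * e00[c] + phi[r] * phi[c]   # entries of P
            swap = e00[r] * phi[c] + phi[r] * e00[c]   # |00><Phi| + |Phi><00|
            U[r][c] = swap + identity - proj
    return U, e00, phi

def apply(M, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

def close(u, v, tol=1e-12):
    """Entrywise comparison of two vectors."""
    return all(abs(a - b) < tol for a, b in zip(u, v))

U, e00, phi = build_gate(3)
print(close(apply(U, e00), phi))   # checks U|00> = |Phi>
print(close(apply(U, phi), e00))   # checks U|Phi> = |00>
```

Since the matrix is real and symmetric, applying it twice returns the input, which is the statement $U = U^\dagger$ combined with unitarity.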

We consider this gate U because it can certainly create 
or remove $\log d \approx n$ ebits but it leaves most of the space 
unchanged. This latter property will allow us to simulate 
U with little communication, implying upper bounds on 
its communication capacity. 

The simulation protocol W. Define 
$|\phi_-\rangle = \frac{1}{\sqrt{2}}(|\Phi\rangle - |00\rangle)$. 
Note that U has only 1 nontrivial eigenvalue, $-1$, 
and the corresponding eigenvector is $|\phi_-\rangle$. Let $\mathcal{M}_i$ be 
the ideal coherent measurement that maps 
$|\phi_-\rangle|0\rangle \to |\phi_-\rangle|0\rangle$ and 
$|\phi\rangle|0\rangle \to |\phi\rangle|1\rangle$ if $\langle\phi|\phi_-\rangle = 0$. 
$\mathcal{M}_i$ is a 2-outcome measurement with POVM elements 
$M_0 = |\phi_-\rangle\langle\phi_-|$, $M_1 = I - |\phi_-\rangle\langle\phi_-|$. 
The protocol W simulates U by using a nonlocal state identification procedure 
$\mathcal{M}_a$ (described below) that will make use of $|\phi_-\rangle^{\otimes m-1}$ to 
approximate $\mathcal{M}_i$. W has 5 steps: 

1. Adjoin ancillas $|\phi_-\rangle^{\otimes m-1}$. 

2. Apply $\mathcal{M}_a$. Store the outcome 0/1 in a qubit C in 
Bob's possession (WLOG). We will prove later that 
$\mathcal{M}_a$ differs from $\mathcal{M}_i$ in the diamond norm [8] by no 
more than $O(m^{-1/2})$ using the catalyst $|\phi_-\rangle^{\otimes m-1}$ and 
$\log(m)$ qubits of communication in each direction. 

3. Apply the gate Diag($-1$, 1) to C, so that $|0\rangle$ is mapped 
to $-|0\rangle$ and $|1\rangle$ is mapped to $|1\rangle$. 

4. Reverse the $\mathcal{M}_a$ applied in step 2, so as to coherently erase the 
outcome in C. This step also requires $\log(m)$ qubits 
of communication in each direction. 

5. Discard the ancillas and system C. 
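
To see why these steps reproduce U when the measurement is ideal, note that steps 2 to 4 then apply a phase of -1 to $|\phi_-\rangle$ and leave its orthogonal complement alone, i.e. they implement $I - 2|\phi_-\rangle\langle\phi_-|$. The sketch below compares that reflection with U entry by entry for a small d. The function name `reflection_vs_gate` and the basis ordering are assumptions made for illustration.

```python
import math

def reflection_vs_gate(d):
    """Largest entrywise gap between U and I - 2|phi_-><phi_-|."""
    n = d + 1
    dim = n * n
    e00 = [0.0] * dim
    e00[0] = 1.0                                    # |00>
    phi = [0.0] * dim
    for k in range(1, d + 1):                       # |Phi>
        phi[k * n + k] = 1.0 / math.sqrt(d)
    # |phi_-> = (|Phi> - |00>) / sqrt(2)
    phi_minus = [(phi[i] - e00[i]) / math.sqrt(2) for i in range(dim)]
    worst = 0.0
    for r in range(dim):
        for c in range(dim):
            identity = 1.0 if r == c else 0.0
            # U = |00><Phi| + |Phi><00| + I - P
            u = (e00[r] * phi[c] + phi[r] * e00[c] + identity
                 - e00[r] * e00[c] - phi[r] * phi[c])
            # ideal steps 2-4: phase -1 on outcome 0, +1 on outcome 1
            w = identity - 2.0 * phi_minus[r] * phi_minus[c]
            worst = max(worst, abs(u - w))
    return worst

print(reflection_vs_gate(3) < 1e-12)
```

The two matrices agree up to floating-point rounding, so all of the simulation error in W comes from replacing the ideal measurement by its approximation.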

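Procedure for nonlocal state identification $\mathcal{M}_a$. 

We start with an informal description of the task, ignoring 
locality constraints. Suppose we want to know 
whether or not an unknown incoming state $|\beta\rangle$ is equal 
to some other state $|\alpha\rangle$, and we have possession of $m-1$ 
copies of $|\alpha\rangle$. One (approximate) method is to project 
$|\alpha\rangle^{\otimes m-1}|\beta\rangle$ onto the symmetric subspace of 
$(\mathbb{C}^d)^{\otimes m}$ (defined as the span of all vectors of the form 
$|\psi\rangle^{\otimes m}$ for $|\psi\rangle \in \mathbb{C}^d$). This defines a 
two-outcome measurement with measurement operators 
$\Pi_{\rm sym} := \frac{1}{m!}\sum_{\pi \in S_m} \pi$ and $I - \Pi_{\rm sym}$. 
(Here $S_m$ is the group of operators that permute the m 
registers.) The outcome corresponding to $\Pi_{\rm sym}$ occurs 
with probability 

$\langle\alpha|^{\otimes m-1}\langle\beta|\,\Pi_{\rm sym}\,|\alpha\rangle^{\otimes m-1}|\beta\rangle = \frac{1}{m!}\sum_{\pi \in S_m} \langle\alpha|^{\otimes m-1}\langle\beta|\,\pi\,|\alpha\rangle^{\otimes m-1}|\beta\rangle.$ 

A fraction $\frac{1}{m}$ of the permutations fix the m-th register. For 
each such $\pi$, 
$\langle\alpha|^{\otimes m-1}\langle\beta|\,\pi\,|\alpha\rangle^{\otimes m-1}|\beta\rangle = 1$. 
The remaining $1 - \frac{1}{m}$ fraction of the permutations swaps the 
m-th register with one of the others. In this case 
$\langle\alpha|^{\otimes m-1}\langle\beta|\,\pi\,|\alpha\rangle^{\otimes m-1}|\beta\rangle = |\langle\alpha|\beta\rangle|^2$. 
Thus the probability of obtaining $\Pi_{\rm sym}$ is 
$\frac{1}{m} + (1 - \frac{1}{m})|\langle\alpha|\beta\rangle|^2 = |\langle\alpha|\beta\rangle|^2 + \frac{1}{m}(1 - |\langle\alpha|\beta\rangle|^2)$, 
and the procedure simulates the measurement 
with operators $\{|\alpha\rangle\langle\alpha|, I - |\alpha\rangle\langle\alpha|\}$ up to error at 
most $1/m$. 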

Observe that instead of $\pi$ ranging over all $m!$ permutations, 
it would suffice to take only the m cyclic permutations. 
For the multi-partite setting, this will allow us to 
save dramatically on communication. We now describe 
the bipartite protocol and derive a careful bound on the 
accuracy. 
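
Before the careful bound, the informal claim can be checked numerically. The sketch below evaluates the acceptance probability of the test state $|\alpha\rangle^{\otimes m-1}|\beta\rangle$ by averaging over a chosen set of permutations, once for all of $S_m$ and once for the m cyclic shifts only, and compares both with $\frac{1}{m} + (1-\frac{1}{m})|\langle\alpha|\beta\rangle|^2$. The function names and the particular vectors are illustrative assumptions; the only fact used is that the overlap of two product states factorizes.

```python
import itertools
import math

def inner(u, v):
    """Inner product <u|v> of two vectors given as lists."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def accept_probability(alpha, beta, m, perms):
    """<psi| (average of the permutations in perms) |psi>
    for the product state psi = alpha^(m-1) (x) beta."""
    factors = [alpha] * (m - 1) + [beta]
    total = 0.0
    for pi in perms:
        term = 1.0
        for k in range(m):
            # overlap of the factor in slot k with the factor moved there
            term *= inner(factors[k], factors[pi[k]])
        total += term
    return (total / len(perms)).real

m = 4
alpha = [1.0, 0.0]
beta = [math.cos(0.7), math.sin(0.7)]
all_perms = list(itertools.permutations(range(m)))
cyclic = [tuple((k + j) % m for k in range(m)) for j in range(m)]

overlap_sq = abs(inner(alpha, beta)) ** 2
predicted = 1.0 / m + (1.0 - 1.0 / m) * overlap_sq
print(accept_probability(alpha, beta, m, all_perms), predicted)
print(accept_probability(alpha, beta, m, cyclic), predicted)
```

Both averages agree with the predicted value, while the cyclic version sums m terms instead of m!.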

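Let $|s\rangle = \frac{1}{\sqrt{m}}\sum_{j=0}^{m-1}|j\rangle$ and S be a register prepared 
in the state $|s\rangle$. Let Y act on $S \otimes (\mathbb{C}^d)^{\otimes m}$ by mapping 
$|j\rangle|\psi_1\rangle|\psi_2\rangle\cdots|\psi_m\rangle$ to 
$|j\rangle|\psi_{1-j}\rangle|\psi_{2-j}\rangle\cdots|\psi_{m-j}\rangle$, with 
arithmetic done mod m. That is, S controls a cyclic 
permutation of the m registers, taking the first register 
to the j-th one if the state of S is $|j-1\rangle$. 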

With a slight abuse of notation, let $\mathcal{M}_i$ and $\mathcal{M}_a$ be 
the ideal and approximate coherent state identification 
protocols for some bipartite state $|\alpha\rangle$, with the answer 
residing with Bob. The state to be measured lives in 
systems AB. Alice and Bob already share $|\alpha\rangle^{\otimes m-1}$ in 
$A_2B_2 \otimes \cdots \otimes A_mB_m$. $\mathcal{M}_a$ is given by: 

1. Alice prepares a register S in the state $|s\rangle$. 

2. Alice applies Y on $S \otimes A \otimes A_2 \cdots A_m$ (i.e. she applies 
the S-controlled cyclic permutation on her halves of 
the m bipartite systems). 

3. Alice sends S to Bob using $\log(m)$ qubits of forward 
communication. 

4. Bob performs Y on $S \otimes B \otimes B_2 \cdots B_m$, thereby completing 
the S-controlled cyclic permutation on the m 
bipartite systems. 

5. Bob coherently measures S with POVM 
$\{|s\rangle\langle s|, I - |s\rangle\langle s|\}$. The final outcome is written to a register C 
in Bob's possession. 

6. Bob performs $Y^\dagger$ on $S \otimes B \otimes B_2 \cdots B_m$. 

7. Bob sends S to Alice using $\log(m)$ qubits of backward 
communication. 

8. Alice applies $Y^\dagger$ on $S \otimes A \otimes A_2 \cdots A_m$. 
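
The eight steps can be traced on a small state vector. The sketch below is a simplification under stated assumptions: each bipartite pair $A_kB_k$ is treated as a single two-level register with basis state 0 standing for $|\alpha\rangle$ and 1 for an orthogonal state, so Alice's and Bob's halves of Y are applied together as one controlled cyclic shift, and the reference system is omitted. The function name `simulate_identification` is hypothetical. It returns the probability of outcome 0 and the overlap of the final state with the ideally measured state.

```python
import math
from collections import defaultdict

def simulate_identification(theta, m):
    """Input cos(theta)|alpha> + sin(theta)|alpha_perp> plus m-1 copies of |alpha>."""
    s_amp = 1.0 / math.sqrt(m)                 # amplitudes of |s>
    zeros = (0,) * m
    flagged = (1,) + (0,) * (m - 1)            # |alpha_perp> in the first register
    state = defaultdict(float)                 # keys: (j, registers, c)
    for j in range(m):                         # step 1: S prepared in |s>, C = 0
        state[(j, zeros, 0)] += math.cos(theta) * s_amp
        state[(j, flagged, 0)] += math.sin(theta) * s_amp

    def shift(st, sign):
        """Controlled cyclic shift Y (sign=+1) or its inverse (sign=-1)."""
        out = defaultdict(float)
        for (j, regs, c), amp in st.items():
            new = [0] * m
            for k in range(m):
                new[(k + sign * j) % m] = regs[k]
            out[(j, tuple(new), c)] += amp
        return out

    state = shift(state, +1)                   # steps 2-4: both halves of Y

    # step 5: coherent measurement of S with {|s><s|, I - |s><s|}
    by_regs = defaultdict(lambda: [0.0] * m)
    for (j, regs, c), amp in state.items():
        by_regs[regs][j] += amp
    measured = defaultdict(float)
    prob_zero = 0.0
    for regs, v in by_regs.items():
        a = sum(v) * s_amp                     # component along |s>
        prob_zero += a * a
        for j in range(m):
            measured[(j, regs, 0)] += a * s_amp
            measured[(j, regs, 1)] += v[j] - a * s_amp

    final = shift(measured, -1)                # steps 6-8: undo Y

    # overlap with the ideal output: outcome 0 on |alpha>, 1 on |alpha_perp>
    overlap = sum(s_amp * (math.cos(theta) * final[(j, zeros, 0)]
                           + math.sin(theta) * final[(j, flagged, 1)])
                  for j in range(m))
    return prob_zero, overlap

print(simulate_identification(math.pi / 3, 5))
```

With $p = \cos^2\theta$, the returned pair can be compared against $\frac{1}{m} + (1 - \frac{1}{m})p$ for the outcome probability and $1 - \frac{1-p}{m}$ for the overlap.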

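We now show that $\mathcal{M}_a$ approximates $\mathcal{M}_i$ in the following 
sense. The diamond norm of a superoperator $\mathcal{S}$ is 
defined as $\|\mathcal{S}\|_\diamond := \max_{\psi \ge 0,\, {\rm tr}\,\psi = 1} \|(I \otimes \mathcal{S})(\psi)\|_1$. We will 
show that $\|\mathcal{M}_a - \mathcal{M}_i\|_\diamond \le \frac{\sqrt{2}}{\sqrt{m}}$. Consider the state 
$|\phi\rangle = \sqrt{p}\,|a_0\rangle_R|\alpha\rangle_{AB} + \sqrt{1-p}\,|a_1\rangle_R|\alpha^\perp\rangle_{AB}$, where R is 
a reference system that may be entangled with the incoming 
systems AB, $\langle\alpha^\perp|\alpha\rangle_{AB} = 0$, and $|a_0\rangle$, $|a_1\rangle$ are 
unit vectors that are not necessarily orthogonal to one 
another. This is the most general initial state. Evolving 
$|\phi\rangle$ according to $\mathcal{M}_a$ gives a final state $|{\rm fin}\rangle$ = 

$\sqrt{p}\,|a_0\rangle|\alpha\rangle^{\otimes m}|s\rangle|0\rangle + \sqrt{1-p}\,|a_1\rangle|\alpha^\perp\rangle|\alpha\rangle^{\otimes m-1}|s\rangle|1\rangle + |{\rm err}\rangle$ 

where $|{\rm err}\rangle$ = 

$\frac{\sqrt{2(1-p)}}{m^{3/2}} \sum_{j,j'} |a_1\rangle|\alpha\rangle^{\otimes (j-j')}|\alpha^\perp\rangle|\alpha\rangle^{\otimes m-1-(j-j')}|j'\rangle|-\rangle$ 

and $|-\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$. The first two terms in $|{\rm fin}\rangle$ are 
precisely the state $|{\rm cor}\rangle$ obtained by applying $\mathcal{M}_i$ to $|\phi\rangle$. 
The last term $|{\rm err}\rangle$ represents the deviation. The derivation 
is routine and is deferred to the appendix. When calculating 
$|\langle{\rm cor}|{\rm err}\rangle|$, only terms with $j = j'$ contribute to 
the inner product. There are m such terms, all being the 
same, giving the bound $|\langle{\rm cor}|{\rm err}\rangle| \le \frac{1-p}{m}$ and matching 
precisely the probability of failure given by the informal 
argument. It also gives $|\langle{\rm cor}|{\rm fin}\rangle| \ge 1 - \frac{1-p}{m} \ge 1 - \frac{1}{m}$. 

We are now ready to apply the well known relation 

$\frac{1}{2}\| |a\rangle\langle a| - |b\rangle\langle b| \|_1 = \sqrt{1 - |\langle a|b\rangle|^2} \le \sqrt{2(1 - |\langle a|b\rangle|)}$ 

to bound $\|\mathcal{M}_a - \mathcal{M}_i\|_\diamond$, which is equal to 

$\sup_{|\phi\rangle} \|(I \otimes \mathcal{M}_a)(|\phi\rangle\langle\phi|) - (I \otimes \mathcal{M}_i)(|\phi\rangle\langle\phi|)\|_1$ 

$= \sup_{|\phi\rangle} \| |{\rm cor}\rangle\langle{\rm cor}| - |{\rm fin}\rangle\langle{\rm fin}| \|_1 \le \frac{\sqrt{2}}{\sqrt{m}}.$ 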

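Returning to the protocol W that simulates U, if we 
replace the two uses of $\mathcal{M}_a$ by $\mathcal{M}_i$, we obtain an exact 
implementation of U. By the triangle inequality, 
$\|U - W\|_\diamond \le 2\|\mathcal{M}_a - \mathcal{M}_i\|_\diamond \le \frac{2\sqrt{2}}{\sqrt{m}}$. For W to simulate U 
with accuracy $\epsilon$, it suffices to take $m = 8/\epsilon^2$. The simulation 
consumes $2\log m$ qubits of communication in each 
direction, that is, $4\log m$ qubits or $8\log m$ classical bits 
in total. Thus we have the following. 

Theorem 1 $C^{\rm any}_{{\rm sim},\epsilon}(U) \le 24 + 16\log\frac{1}{\epsilon}$. 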
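
For concreteness, the bound in Theorem 1 can be tabulated against the accuracy. The sketch below assumes base-2 logarithms and takes the number of registers to be $m = 8/\epsilon^2$ rounded up, which is the value for which $8\log m$ equals $24 + 16\log(1/\epsilon)$; both function names are hypothetical.

```python
import math

def copies_needed(eps):
    """Number of registers m (input plus m-1 catalyst copies), assuming m = 8/eps^2."""
    return math.ceil(8.0 / eps ** 2)

def simulation_cost_bits(eps):
    """Classical-bit cost bound 24 + 16*log2(1/eps) from Theorem 1."""
    return 24.0 + 16.0 * math.log2(1.0 / eps)

for eps in (0.5, 0.1, 0.01):
    print(eps, copies_needed(eps), simulation_cost_bits(eps))
```

The cost depends only on the target accuracy, not on d or n, while the number of catalyst copies grows as $1/\epsilon^2$.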

Here U is implicitly parameterized by the system size 
n, yet the simulation cost is independent of it. Note as 
well that the nonlocal state identification protocol gener- 
alizes straightforwardly to more than two remote parties 
(say, k). One way to do this is for one party to create $|s\rangle$ 
which is then circulated among all parties. Another way 
is to have the k parties sharing 
$|s\rangle = \frac{1}{\sqrt{m}}\sum_{j=0}^{m-1}|j\rangle^{\otimes k}$; 
each sends his share to the party designated to have the 
answer, and has the share returned to complete the protocol. 
Next we prove two results based on the simulation 
protocols and Theorem 1. 

Consequence 1: EPR pairs are suboptimal for 
simulation cost. 

Let $\delta := \sqrt{4\epsilon}$. We claim that 
$C^{\rm EPR}_{{\rm sim},\epsilon}(U) \ge A_\epsilon := 2\log(d) + \log((1-2\delta)(1-\delta)^2)$. (1) 



Proof. Let $|\varphi\rangle = \frac{1}{\sqrt{2}}(|\Phi\rangle_{AB} \otimes |00\rangle_{A'B'} + |00\rangle_{AB} \otimes |\Phi\rangle_{A'B'})$. 
We consider the state-change from $|\varphi\rangle$ to $(U_{AB} \otimes I_{A'B'})|\varphi\rangle$. 
Recall that $|\Phi\rangle = \frac{1}{\sqrt{d}}(|11\rangle + \cdots + |dd\rangle)$, so $|\varphi\rangle$ is just 
a maximally entangled state of Schmidt rank 2d. By 
Corollary 10 of Ref [§], preparing $(U_{AB} \otimes I_{A'B'})|\varphi\rangle$ up to 
fidelity $1 - \epsilon$ from any maximally entangled state requires 
an amount of communication at least as large as $A_\epsilon$. 
Therefore, simulating U up to error $\epsilon$ given unlimited 
EPR pairs requires $A_\epsilon$ bits of communication, in contrast 
to the $O(\log\frac{1}{\epsilon})$ bits of communication in Theorem 1 when 
$O(\frac{1}{\epsilon^2})$ copies of $|\phi_-\rangle$ are available. □ 
Note that any $n \times n$-qubit unitary can be trivially simulated 
with EPR pairs and $4n$ bits of communication 
by teleporting Alice's input to Bob, having him apply 
the unitary and then teleporting her system back. Thus, 
Eq. (1) implies that even given unlimited EPR pairs and 
allowing a small error, simulating U is at least half as 
costly as simulating a completely general unitary on $n \times n$ 
qubits. 

Consequence 2: Some gates can entangle expo- 
nentially more than they can communicate. 

Since $U|00\rangle = |\Phi\rangle$, we can bound $E_{\rm cap}(U) \ge \log(2^n - 1) \approx n$. 
On the other hand, we have: 

Theorem 2 For any $c > 2$ and for all n sufficiently 
large, $C^{\rm any}_{\rm cap}(U) \le 8c\log n$. 

When communicating using a gate in both directions 
simultaneously, there is generally a tradeoff between the 
forward and backward communication rates. The one-way 
capacity in each direction is an extreme point of 
that tradeoff. We denote these capacities by $C^{\rm any}_{{\rm cap},\to}(U)$ 
and $C^{\rm any}_{{\rm cap},\leftarrow}(U)$. Theorem 2 can be proved by showing 
$C^{\rm any}_{{\rm cap},\to}(U) \le 4c\log n$, since the symmetry of U means 
that the same bound applies to $C^{\rm any}_{{\rm cap},\leftarrow}(U)$, and finally we 
can bound $C^{\rm any}_{\rm cap}(U) \le C^{\rm any}_{{\rm cap},\to}(U) + C^{\rm any}_{{\rm cap},\leftarrow}(U) \le 8c\log n$. 

Proof of $C^{\rm any}_{{\rm cap},\to}(U) \le 4c\log n$. 

The nonlocal state identification protocol $\mathcal{M}_a$ uses 
shared entangled states between Alice and Bob and $\log m$ 
qubits of communication in each direction. Since the protocol 
W that simulates U uses $\mathcal{M}_a$ twice, W uses $2\log m$ 
qubits of forward communication. But back communication 
and shared entanglement cannot increase the classical 
capacity of a noiseless forward quantum channel beyond 
the superdense coding bound [10], thus 

$C^{\rm any}_{{\rm cap},\to}(W) \le 4\log m.$ (2) 

It remains to show that $C^{\rm any}_{{\rm cap},\to}(W) \approx C^{\rm any}_{{\rm cap},\to}(U)$ if 
$\|W - U\|_\diamond$ is small. To make this quantitative, we prove the 
following continuity bound in the appendix. 

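Lemma 3 If $\mathcal{M}_1, \mathcal{M}_2$ are bidirectional channels with outputs 
in $\mathbb{C}^{d+1} \otimes \mathbb{C}^{d+1}$ such that $\|\mathcal{M}_1 - \mathcal{M}_2\|_\diamond \le \epsilon$, then 

$|C^{\rm any}_{{\rm cap},\to}(\mathcal{M}_1) - C^{\rm any}_{{\rm cap},\to}(\mathcal{M}_2)| \le 8\epsilon\log(d+1) + 4H_2(\epsilon)$ 

where $H_2$ is the binary entropy function. 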
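
To get a feel for the size of this continuity bound, the sketch below evaluates $8\epsilon\log(d+1) + 4H_2(\epsilon)$ directly. Base-2 logarithms and the function names are assumptions made for illustration.

```python
import math

def binary_entropy(x):
    """H_2(x) in bits, with the convention H_2(0) = H_2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def continuity_bound(eps, d):
    """Right-hand side of Lemma 3: 8*eps*log2(d+1) + 4*H_2(eps)."""
    return 8.0 * eps * math.log2(d + 1) + 4.0 * binary_entropy(eps)

# The first term grows with n = log2(d+1), so eps must shrink faster than 1/n.
for n in (10, 100, 1000):
    d = 2 ** n - 1
    print(n, continuity_bound(1.0 / n ** 2, d))
```

The first term scales with the number of qubits n, which is why the accuracy of the measurement has to improve as the system grows.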
Our continuity bound means that the more accurate 
$\mathcal{M}_a$ is, the closer the capacities of U and W are. On 
the other hand, making $\mathcal{M}_a$ more accurate requires more 
communication. Thus we face a trade-off between keep- 
ing the capacity of W small and keeping the capacities 
of U and W close to each other. Optimizing will give us 
a bound of O(logn) bits on the capacity of U. 

Completing the proof of $C^{\rm any}_{{\rm cap},\to}(U) \le 4c\log n$. 

Recall that the accuracy of the approximate nonlocal 
state identification in terms of the communication cost 
is $\eta = \frac{\sqrt{2}}{\sqrt{m}}$, and that $\|U - W\|_\diamond \le 2\eta = \epsilon$. According 
to Lemma 3, since $\log(d+1) = n$, the difference in the 
capacities of U and W is suppressed if $m = n^c$ for $c > 2$. 
More precisely, 

$C^{\rm any}_{{\rm cap},\to}(U) \le C^{\rm any}_{{\rm cap},\to}(W) + 16\eta\log(d+1) + 4H_2(2\eta)$ 

$\le 4\log m + 16\eta n + 8\sqrt{2\eta}$ 

$\le 4c\log n + 16\sqrt{2}\, n^{1-c/2} + 8 \cdot 2^{0.75}\, n^{-c/4}$ 

where each term is bounded by the corresponding term 
in the subsequent line (and $H_2(x) \le 2\sqrt{x}$). □ 
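
The separation can be made tangible by evaluating the last line of this chain next to the entanglement bound $E_{\rm cap}(U) \ge \log(2^n - 1)$. The sketch below assumes base-2 logarithms; the function names and the sample values of n and c are illustrative choices.

```python
import math

def one_way_capacity_bound(n, c):
    """4c*log2(n) + 16*sqrt(2)*n^(1-c/2) + 8*2^0.75*n^(-c/4), valid for c > 2."""
    return (4.0 * c * math.log2(n)
            + 16.0 * math.sqrt(2.0) * n ** (1.0 - c / 2.0)
            + 8.0 * 2.0 ** 0.75 * n ** (-c / 4.0))

def entanglement_lower_bound(n):
    """log2(d) with d = 2^n - 1, the entanglement created by U|00> = |Phi>."""
    return math.log2(2 ** n - 1)

for n in (64, 1024, 2 ** 20):
    two_way = 2.0 * one_way_capacity_bound(n, 4)   # forward plus backward
    print(n, two_way, entanglement_lower_bound(n))
```

For c = 4 the two correction terms decay like 1/n, so the bound is dominated by the logarithmic term, while the entanglement capacity grows linearly in n.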

Extensions. Our simulation procedure allows us to simulate 
any bipartite gate with r non-trivial eigenvalues 
using $O(r\log(r/\epsilon))$ qubits of communication. This is 
accomplished by testing the state held by Alice and Bob 
sequentially against each of the r corresponding eigenvectors. 
Each individual test needs to have error $\epsilon/r$ so 
that the total error can be bounded by $\epsilon$. This simulation 
method is useful for $r \ll \log(d)$ (since a gate can be 
trivially simulated using $\log d$ qubits of communication 
in each direction). 

Regarding unitary gate capacities, we have shown that 
$C^{\rm any}_{\rm cap}(U)$ can scale like the logarithm of $E_{\rm cap}(U)$. However, 
it is unknown how much further this result could be 
improved. For our example, it is possible that $C^{\rm any}_{\rm cap}(U)$ 
can be upper-bounded by a constant even as $n \to \infty$. 
Moreover, it is possible that even stronger separations 
are possible. Bound 1 of [4] implies that $C^{\rm any}_{\rm cap}(U) > 0$ 
whenever $E_{\rm cap}(U) > 0$, but even for fixed dimension no 
nonzero lower bound on $C^{\rm any}_{\rm cap}(U)$ is known. The difficulty 
is that the proof in [4] relates $C^{\rm any}_{\rm cap}(U)$ to the amount of 
entanglement which one use of U can create from unentangled 
inputs. This quantity can be arbitrarily smaller 
than $E_{\rm cap}(U)$ even for fixed dimensions. 
PROOFS 



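Deriving the state evolved by $\mathcal{M}_a$ 

We use all the notations defined in the main text. In the proof of $\|\mathcal{M}_a - \mathcal{M}_i\|_\diamond \le \frac{\sqrt{2}}{\sqrt{m}}$, we claim that the output state 
of applying $\mathcal{M}_a$ to the most general initial state $|\phi\rangle = \sqrt{p}\,|a_0\rangle_R|\alpha\rangle_{AB} + \sqrt{1-p}\,|a_1\rangle_R|\alpha^\perp\rangle_{AB}$ is of a certain form. Here 
is a justification of this fact. The state after attaching the ancillas is: 

$\sqrt{p}\,|a_0\rangle|\alpha\rangle^{\otimes m}|s\rangle + \sqrt{1-p}\,|a_1\rangle|\alpha^\perp\rangle|\alpha\rangle^{\otimes m-1}|s\rangle.$ 

After Alice applies Y, communicates S to Bob, and Bob applies Y: 

$\sqrt{p}\,|a_0\rangle|\alpha\rangle^{\otimes m}|s\rangle + \sqrt{1-p}\,|a_1\rangle \frac{1}{\sqrt{m}}\sum_j |\alpha\rangle^{\otimes j}|\alpha^\perp\rangle|\alpha\rangle^{\otimes m-1-j}|j\rangle.$ 

Now Bob attaches $|0\rangle_C$ and makes the coherent measurement on S, taking $|s\rangle|0\rangle_C \to |s\rangle|0\rangle_C$ and $|s^\perp\rangle|0\rangle_C \to |s^\perp\rangle|1\rangle_C$ 
for all $\langle s^\perp|s\rangle = 0$. To write down the resulting state, we should rewrite each $|j\rangle$ in the Fourier basis which includes 
$|s\rangle$. But to obtain just a bound, we can simply express $|j\rangle = \frac{1}{\sqrt{m}}|s\rangle + |s_j\rangle$ where $\langle s_j|s\rangle = 0$. The measurement 
on S thus results in the state 

$\sqrt{p}\,|a_0\rangle|\alpha\rangle^{\otimes m}|s\rangle|0\rangle + \sqrt{1-p}\,|a_1\rangle \frac{1}{\sqrt{m}}\sum_j |\alpha\rangle^{\otimes j}|\alpha^\perp\rangle|\alpha\rangle^{\otimes m-1-j} \left( \frac{1}{\sqrt{m}}|s\rangle|0\rangle + |s_j\rangle|1\rangle \right).$ 

Here, the second occurrence of the $|s\rangle|0\rangle$ term (the one in the parenthesis) represents an erroneous measurement 
outcome. We add and subtract $\frac{1}{\sqrt{m}}|s\rangle|1\rangle$ in the parenthesis: 

$\sqrt{p}\,|a_0\rangle|\alpha\rangle^{\otimes m}|s\rangle|0\rangle + \sqrt{1-p}\,|a_1\rangle \frac{1}{\sqrt{m}}\sum_j |\alpha\rangle^{\otimes j}|\alpha^\perp\rangle|\alpha\rangle^{\otimes m-1-j} \left( \frac{1}{\sqrt{m}}|s\rangle(|0\rangle - |1\rangle) + |j\rangle|1\rangle \right).$ 

Rearranging, we get: 

$\sqrt{p}\,|a_0\rangle|\alpha\rangle^{\otimes m}|s\rangle|0\rangle + \sqrt{1-p}\,|a_1\rangle \frac{1}{\sqrt{m}}\sum_j |\alpha\rangle^{\otimes j}|\alpha^\perp\rangle|\alpha\rangle^{\otimes m-1-j}|j\rangle|1\rangle$ 

$+ \frac{\sqrt{2(1-p)}}{m}\,|a_1\rangle \sum_j |\alpha\rangle^{\otimes j}|\alpha^\perp\rangle|\alpha\rangle^{\otimes m-1-j}|s\rangle|-\rangle$ 

where the first line is what an ideal measurement will produce (with unit norm), and the second line represents an 
error term (and it is not orthogonal to the ideal portion, since the sum is also normalized). Now, Bob applies $Y^\dagger$ and 
sends S back to Alice, who then applies $Y^\dagger$, resulting in the final state $|{\rm fin}\rangle = |{\rm cor}\rangle + |{\rm err}\rangle$ where 

$|{\rm cor}\rangle = \sqrt{p}\,|a_0\rangle|\alpha\rangle^{\otimes m}|s\rangle|0\rangle_C + \sqrt{1-p}\,|a_1\rangle|\alpha^\perp\rangle|\alpha\rangle^{\otimes m-1}|s\rangle|1\rangle_C$ 

$|{\rm err}\rangle = \frac{\sqrt{2(1-p)}}{m^{3/2}} \sum_{j,j'} |a_1\rangle|\alpha\rangle^{\otimes (j-j')}|\alpha^\perp\rangle|\alpha\rangle^{\otimes m-1-(j-j')}|j'\rangle|-\rangle_C$ 

as claimed. 

We can bound $\||{\rm err}\rangle\|^2$ by inspecting the expression right after the rearrangement, which gives $\||{\rm err}\rangle\|^2 \le \frac{2(1-p)}{m}$. 
This implies $|\langle{\rm cor}|{\rm fin}\rangle| \ge 1 - |\langle{\rm cor}|{\rm err}\rangle| \ge 1 - \frac{\sqrt{2(1-p)}}{\sqrt{m}}$. Alternatively, we can explicitly calculate $|\langle{\rm cor}|{\rm err}\rangle|$ using their 
expressions given above. Only the $j = j'$ terms contribute to the inner product. But there are m such terms, all 
being the same, giving the slightly better bound $|\langle{\rm cor}|{\rm err}\rangle| \le \frac{1-p}{m}$ and matching the probability of failure given by 
the informal argument. 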
Proof of Lemma 3. Our proof will closely parallel that of Lemma 1 of [5], which is similar to the above but holds 
for the case when $\mathcal{M}_1$ and $\mathcal{M}_2$ are isometries. The main ingredient in both proofs is a single-shot capacity formula for 
bidirectional channels, first established for isometries in [4], but then extended to arbitrary bidirectional channels: 

$C^{\rm any}_{{\rm cap},\to}(W) = \sup_{\rho^{XAA'BB'}} I(X;BB')_{W(\rho)} - I(X;BB')_{\rho}.$ (3) 

Here A, B are the registers acted on by W, A', B' are ancillas of arbitrary dimension, X is a classical register, 
$I(X;Y) = H(X) + H(Y) - H(XY)$ is the quantum mutual information of the state given by the subscript. $H(R) = 
H(\sigma) = -{\rm tr}\,\sigma\log\sigma$ is the von Neumann entropy for the reduced density matrix $\sigma$ on the system R. When one of 
the registers X is classical, the state on XY represents an ensemble of quantum states on Y labeled by basis states 
of X, and the quantum mutual information is the Holevo information [12]. Eq. (3) can be interpreted to mean that 
$C^{\rm any}_{{\rm cap},\to}(W)$ equals the largest single-shot increase in mutual information possible when applying W to any ensemble 
of bipartite states. Due to Eq. (3), 

$C^{\rm any}_{{\rm cap},\to}(U) - C^{\rm any}_{{\rm cap},\to}(W) \le I(X;BB')_{U(\rho)} - I(X;BB')_{W(\rho)}$ (4) 

where $\rho$ attains the supremum in the expression for $C^{\rm any}_{{\rm cap},\to}(U)$ to some arbitrary precision. (This precision parameter 
is independent from all other parameters considered, and thus will be omitted for simplicity.) 

Thus the desired continuity bound is essentially a continuity result for quantum mutual information. The crucial 
challenge is the lack of dimensional bounds on the systems X and B', so that Fannes inequality [3] does not provide 
the needed continuity result. Instead, we use a generalization due to Fannes and Alicki [14] that applies to conditional 
entropy: 

$|H(Y|Z)_{\sigma} - H(Y|Z)_{\sigma'}| \le 4\epsilon\log d + 2H_2(\epsilon),$ 

where $\epsilon = \|\sigma - \sigma'\|_1$ and $d = \dim Y$. Remarkably, this Fannes-Alicki inequality provides an upper bound that is 
independent of the size of the conditioned system Z. 

Returning to Eq. (4), first note that if $\|W - U\|_\diamond \le \epsilon$, then $\|W(\rho) - U(\rho)\|_1 \le \epsilon$. Next, we can expand $I(X;BB')$ 
as 

$I(X;BB') = H(B') + H(B|B') - H(B|B'X) - H(B'|X).$ 

We now bound the difference of each of the above terms when evaluated on W(p) and U(p). The H(B') and H(B'\X) 
terms are the same for both states since W and U act only on A, B. Applying the Fannes-Alicki inequality to the 
remaining two terms and using dim B = d + 1 establishes the Lemma. 



